Significance Analysis of Prognostic Signatures
Abstract
A major goal in translational cancer research is to identify biological signatures driving cancer progression and metastasis. A common technique applied in genomics research is to cluster patients using gene expression data from a candidate prognostic gene set, and if the resulting clusters show statistically significant outcome stratification, to associate the gene set with prognosis, suggesting its biological and clinical importance. Recent work has questioned the validity of this approach by showing in several breast cancer data sets that "random" gene sets tend to cluster patients into prognostically variable subgroups. This work suggests that new rigorous statistical methods are needed to identify biologically informative prognostic gene sets. To address this problem, we developed Significance Analysis of Prognostic Signatures (SAPS), which integrates standard prognostic tests with a new prognostic significance test based on stratifying patients into prognostic subtypes with random gene sets. SAPS ensures that a significant gene set is not only able to stratify patients into prognostically variable groups, but is also enriched for genes showing strong univariate associations with patient prognosis, and performs significantly better than random gene sets. We use SAPS to perform a large meta-analysis (the largest completed to date) of prognostic pathways in breast and ovarian cancer and their molecular subtypes. Our analyses show that only a small subset of the gene sets found statistically significant using standard measures achieve significance by SAPS. We identify new prognostic signatures in breast and ovarian cancer and their corresponding molecular subtypes, and we show that prognostic signatures in ER-negative breast cancer are more similar to prognostic signatures in ovarian cancer than to prognostic signatures in ER-positive breast cancer.
SAPS is a powerful new method for deriving robust prognostic biological signatures from clinically annotated genomic datasets.
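The comparison against random gene sets described in the abstract can be illustrated with a small permutation-style sketch. This is not the authors' implementation: the function names, the toy `score` function (a stand-in for a real clustering-plus-survival statistic), the add-one correction, and the rule of taking the largest of the component p-values are all assumptions made here for illustration.

```python
import random


def empirical_p_value(observed, null_scores):
    """Share of null scores at least as large as the observed score.

    The +1 in numerator and denominator is an assumed convention that
    keeps the estimate away from exactly zero.
    """
    hits = sum(1 for s in null_scores if s >= observed)
    return (hits + 1) / (len(null_scores) + 1)


def random_gene_set_p_value(score_fn, gene_set, all_genes, n_random=1000, seed=0):
    """Compare a candidate gene set with random gene sets of the same size."""
    rng = random.Random(seed)
    observed = score_fn(gene_set)
    null_scores = [
        score_fn(rng.sample(all_genes, len(gene_set))) for _ in range(n_random)
    ]
    return empirical_p_value(observed, null_scores)


def passes_all_tests(p_values, alpha=0.05):
    """Assumed combination rule: every component test must be significant."""
    return max(p_values) <= alpha


# Hypothetical toy data: 200 genes, the first 10 carry a strong prognostic weight.
all_genes = [f"g{i}" for i in range(200)]
weights = {g: (5 if i < 10 else 1) for i, g in enumerate(all_genes)}


def score(gene_set):
    """Toy prognostic score: mean per-gene weight (stand-in for a survival test)."""
    return sum(weights[g] for g in gene_set) / len(gene_set)


strong = all_genes[:10]   # candidate set made of the strongly weighted genes
weak = all_genes[10:20]   # candidate set no better than background

p_strong = random_gene_set_p_value(score, strong, all_genes, n_random=199)
p_weak = random_gene_set_p_value(score, weak, all_genes, n_random=199)
print(p_strong, p_weak)
```

In a real analysis, `score` would be replaced by a statistic measuring how well clusters derived from the gene set separate patient outcomes, and the resulting p-value would be considered alongside the standard stratification test and the enrichment test mentioned above.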
Similar resources
A Simple but Highly Effective Approach to Evaluate the Prognostic Performance of Gene Expression Signatures
BACKGROUND Highly parallel analysis of gene expression has recently been used to identify gene sets or 'signatures' to improve patient diagnosis and risk stratification. Once a signature is generated, traditional statistical testing is used to evaluate its prognostic performance. However, due to the dimensionality of microarrays, this can lead to false interpretation of these signatures. PRIN...
Homogeneous Datasets of Triple Negative Breast Cancers Enable the Identification of Novel Prognostic and Predictive Signatures
BACKGROUND Current prognostic gene signatures for breast cancer mainly reflect proliferation status and have limited value in triple-negative (TNBC) cancers. The identification of prognostic signatures from TNBC cohorts was limited in the past due to small sample sizes. METHODOLOGY/PRINCIPAL FINDINGS We assembled all currently publicly available TNBC gene expression datasets generated on Af...
Prognostic immune-related gene models for breast cancer: a pooled analysis
Breast cancer, the most common cancer among women, is a clinically and biologically heterogeneous disease. Numerous prognostic tools have been proposed, including gene signatures. Unlike proliferation-related prognostic gene signatures, many immune-related gene signatures have emerged as principal biology-driven predictors of breast cancer. Diverse statistical methods and data sets were used fo...
Classification of Non-Small Cell Lung Cancer Using Significance Analysis of Microarray-Gene Set Reduction Algorithm
Among non-small cell lung cancers (NSCLC), adenocarcinoma (AC) and squamous cell carcinoma (SCC) are the two major histological subtypes, accounting for roughly 40% and 30% of all lung cancer cases, respectively. Since AC and SCC differ in their cell of origin, location within the lung, and growth pattern, they are considered distinct diseases. Gene expression signatures have been demonstrated to b...
Comparison Analysis of Particulate Matters in a Micro Environment
Different approaches to source apportionment of dust fractions have been reported worldwide. Predicting source categories within receptor chemical profiles using regression and factor analysis using PCA has been reported to evaluate possible sources/routes of air pollution mass. The present study is focused on the application of all three approaches to investigate higher degrees of significance...